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ABSTRACT 


Certain  methods  of  analysis  for  testing  the  hypothesis 
that  organized  motion  exists  in  the  airflow  above  naturally 
occurring  water  waves  are  examined.   In  particular,  two  cur- 
rent methods,  spectral  analysis  and  joint  probability  density 
function  analysis,  are  briefly  discussed;  two  new  methods, 
matched  filters  and  parametric  time  series  analysis,  are 
suggested . 

An  analysis  is  done  with  parametric  time  series  analysis 
on  wind-wave  data.   The  results  show  the  tractability  of 
this  data  to  parametric  methods,  and  the  results  are  in 
agreement  with  previous  spectral  analyses  of  the  data  in  the 
regions  of  overlap. 
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I.   INTRODUCTION 

In  recent  years,  interest  in  parameterizing  and  defining 
processes  in  the  air-sea  boundary  layer  has  intensified.   Re- 
cent theoretical  models  [Miles,  1957]  postulate,  among  other 
things,  that  the  airflow  near  the  surface  and  the  waves  are 
coupled  in  such  a  manner  as  to  produce  organized  motion  in 
the  air  in  relation  to  the  wave  field.   This  results  in  air 
motion  with  the  same  period  as  the  dominant  period  of  the 
waves.   This  paper  examines  certain  methods  of  analysis  by 
which  these  hypotheses  can  be  tested.   In  particular,  two 
current  methods  are  discussed,  two  new  methods  suggested,  and 
an  analysis  using  one  of  the  new  methods  is  performed. 
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IX.   BACKGROUND 

If  organized  motion  exists  in  the  air  due  to  the  presence 
of  the  underlying  waves,  its  existence  and  nature  should  be 
reflected  in  observations  of  the  wind  over  the  waves.   Unfor- 
tunately for  a  prospective  analyst  of  such  observations,  this 
regime  is  characterized  by  the  presence  of  a  general  turbulent 
field  which  serves  to  mask  the  organized  motion.   The  problem, 
then,  is  to  discover  whether  organized  motion  does  exist,  and 
if  it  does,  its  nature,  in  the  midst  of  random  turbulent  in- 
fluences.  This  should  be  determined  from  data  obtained  over 
natural  waves . 

It  is  the  usual  case  that  data  of  interest  take  the  form 
of  simultaneous  continuous  time  traces  of  variables  of  inter- 
est, like  wind  components  and  wave  heights.   For  purposes  of 
ease  of  analysis,  such  records  are  usually  subsequently  dis- 
cretized  so  that  the  final  form  is  that  of  a  multivariate 
discrete  time  series.   This  is  the  case  with  data  used  in 
this  paper.   That  data  was  obtained  from  observations  over 
natural  waves  on  Lake  Michigan;  the  data  are  fully  described 
elsewhere  [Davidson,  1970]  . 

Time  series  differ  from  other  types  of  statistical  sample 
realizations  in  that  the  observations  are  assumed  to  be  de- 
pendent upon  one  another,  whereas  in  the  larger  body  of  sta- 
tistical technique,  it  is  preferred  that  observations  be 
independent.   The  exploitation  of  this  expected  dependence 
is  a  distinguishing  characteristic  of  time  series  analysis. 
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As  is  suggested  by  the  form  of  the  data,  the  majority  of  the 
techniques  considered  here  are  based  on  time  series  analysis. 
One  method,  that  of  joint  probability  distribution  function 
analysis,  is  not  drawn  from  the  body  of  time  series  analysis, 
but  it  also  uses  the  dependent  nature  of  the  data  to  advan- 
tage, though  indirectly. 

A.   SPECTRAL  ANALYSIS 

Most  widely  known  of  the  time  series  analysis  techniques 
is  spectral  analysis.   This  technique  examines  the  data  in 
the  frequency  domain.   The  most  common  representation  of  spec- 
tral results  is  the  sample  spectrum,  whose  magnitude  for  a 
given  frequency  reflects  the  amount  of  variance  in  the  data 
which  can  be  accounted  for  by  the  presence  of  a  cosine  compon- 
ent at  the  given  frequency.   It  is,  in  fact,  the  Fourier 
cosine  transform  of  the  estimated  autocovariance  function. 
Hence,  the  spectrum  of  data  arising  from  a  process  dominated 
by  a  periodic  effect  should  exhibit  a  marked  peak  at  the  fre- 
quency of  the  periodic  effect,  while  the  spectrum  of  white 
noise  should  be  flat.   Thus  noise  should  raise  the  level  of 
the  spectrum,  but  should  not  obscure  significant  peaks.   Gen- 
erally, if  the  hypothesized  relationship  hold,  one  would  ex- 
pect a  peak  in  the  wind  component  spectra  at  the  same  frequency 
as  the  major  peak  in  the  wave  spectrum. 

Spectral  analysis  also  has  a  natural  extension  into  multi- 
variate space,  called  cross-spectral  analysis.   As  the  co- 
variance  of  wind  and  waves  is  of  primary  interest  here, 
calculation  of  cross-spectra  should  reveal  some  information 
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of  interest,  including  estimates  of  phase  relationships,  co- 
herence, and  cross-coyariance  with  respect  to  frequency.   Pre- 
dictions from  theoretical  models  for  these  yalues  may  then  be 
compared  to  the  estimates  resulting  from  actual  observations 
as  an  indication  of  the  validity  of  the  theory. 

Spectral  analysis  is  not  without  major  drawbacks,  however. 
A  major  handicap  is  that  it  can  only  deal  adequately  with 
stationary  time  series.   Often  natural  time  series  are  not 
observed  to  be  stationary,  and  may  require  elaborate  algo- 
rithms to  be  applied  in  an  effort  to  remove  trends  and/or 
seasonality  to  obtain  s tat ionar ity .   A  second  difficulty  is 
the  lack  of  smoothness  in  actual  sample  spectra.   A  typical 
sample  spectrum  is  so  jagged  that  interpretation  is  difficult. 
The  usual  remedy  is  to  smooth  the  data,  smooth  the  spectrum, 
or  both.   This  makes  the  information  more  readily  apprehended, 
but  the  complex  effect  of  such  smoothing  on  individual  esti- 
mates of  spectral  ordinates  makes  the  statistical  significance 
of  the  estimates  difficult  or  impossible  to  determine.   It 
may  also  result  in  a  loss  of  information  contained  in  the 
data.   Further,  while  confidence  intervals  for  unsmoothed 
estimates  of  spectral  ordinates  may  be  calculated  theoretical- 
ly, theory  also  shows  that  the  error  in  the  estimate  of  the 
spectral  ordinate  will  be  of  the  same  order  of  magnitude  as 
the  spectral  ordinate  itself  {Kendall  and  Stuart,  1966]. 
These  facts  make  it  difficult  to  quantitatively  support  the 
belief  that  a  particular  peak  in  a  smoothed  spectrum  is 
significant.   Thus  Kendall  and  Stuart  [1966]  rightly  caution 
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meteorologists  and  oceanographers  by  name  about  assuming  the 
existence  of  periodic  elements  in  the  data  generating  process 
solely  on  the  basis  of  spectral  analysis;  care  is  necessary. 
Further  discussions  of  other  possible  pitfalls  as  well  as  much 
more  complete  discussions  of  spectral  analysis  are  to  be  found 
in  Blackman  and  Tukey  {1958],  Kendall  and  Stuart  [1966],  and 
Jenkins  and  Watts  [1968]  . 

Although  unsatisfying  in  some  statistical  senses,  spectral 
analysis  is  a  useful  tool  to  be  applied  to  looking  for  wind- 
wave  coupling.   The  data  used  in  this  paper  was  extensively 
analyzed  spectrally  by  Davidson  [1970],  with  good  results. 

B.   MATCHED  AND  WIENER  FILTERS 

Another  type  of  analysis  related  to  spectral  analysis 
which  may  hold  promise  for  the  wind-wave  coupling  problem  is 
to  be  found  in  the  repertoire  of  electrical  engineering. 
Electrical  engineers  are  frequently  faced  with  the  problem 
of  detecting  and  isolating  a  signal  in  the  midst  of  noise, 
precisely  analogous  to  looking  for  organized  motion  in  the 
air  in  the  midst  of  turbulence.   To  solve  this  problem,  two 
types  of  spectrally  derived  filters  have  been  developed,  de- 
signed to  give  a  large  response  to  signal  and  a  small  res- 
ponse to  noise.   They  are  matched  filters  and  Wiener  filters. 
To  the  best  knowledge  of  the  author,  no  one  has  yet  applied 
such  filters  to  wind-wave  data  looking  for  organized  motion, 
but  Sokol  £  19  7 1 J  applied  this  technique  to  detection  of  ther- 
mal plumes  in  the  air  adjacent  to  the  sea  from  temperature 
traces  of  a  type  similar  to  the  wind-wave  data.   His  results 
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were  quite  encouraging  for  the  hypothesis  that  such  an  analy- 
sis might  be  useful  in  the  detection  of  organized  motion  in 
turbulence . 

At  present,  the  major  problem  with  the  application  of 
these  techniques  to  wind-wave  data  appears  to  be  sensitivity 
to  small  errors  in  estimates  of  certain  parameters  which  are 
difficult  to  estimate  accurately,  like  the  phase  of  the  per- 
iodic effect.  However,  the  magnitude  of  such  problems  is  as 
yet  unknown,  and  does  not  preclude  at  least  a  comprehensive 
preliminary  examination  of  these  techniques  for  practicality 
with  respect  to  the  wind-wave  problem. 

C.   JOINT  PROBABILITY  DENSITY  FUNCTION  ANALYSIS 

The  one  analysis  technique  currently  used  in  this  field 
which  is  not  properly  a  member  of  time  series  analysis  is 
joint  probability  density  function  analysis.   Basically,  this 
method  examines  the  data  in  the  probability  domain.   In  the 
analysis,  the  probability  density  function  of  the  set  of 
joint  occurrences  of  two  variables  is  calculated.   Equal 
joint  probability  density  contours  are  plotted  on  a  two  di- 
mensional array.   The  contours  are  then  subjectively  analyzed 
for  deviations  from  the  concentric  circle  pattern  of  a  theo- 
retical random,  independent  joint  probability  density  func- 
tion, the  current  reference  function  being  the  bivariate 
standard  normal  joint  probability  density  function.   The 
pattern  of  these  deviations,  which  is  assumed  to  arise  be- 
cause of  the  dependent  nature  of  the  time  series,  is  then 
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evaluated  in  terms  of  the  generating  process  and  compared  to 
what  is  expected  by  theory.   This  type  of  analysis  has  been 
applied  to  meteorological  data  by  Holland  [1972],  Davidson 
and  Frank  11972],  and  Frank  11971]  . 

This  method  shows  great  promise;  however,  it  lacks  refine- 
ment as  yet.   The  choice  of  the  bivariate  standard  normal 
joint  probability  density  function  as  a  reference  distribution 
is  formally  incorrect,  although  it  may  be  satisfactory  in 
cases  involving  many  degrees  of  freedom.   The  proper  choice 
is  a  bivariate  distribution  analogous  to  the  Student-t  distri- 
bution in  the  univariate  case.   Using  such  a  distribution, 
described  in  some  detail  in  Birnbaum  [1962],  it  appears  quite 
feasible  to  develop  an  objective  test  of  the  significance  of 
deviations  from  the  distribution  expected  if  the  generating 
processes  were  random  and  independent.   Further  tests  against 
such  a  reference  distribution  reduce,  effectively,  to  t-tests 
for  correlation  coefficients  and  regression  coefficients  cal- 
culated between  the  two  variables  [Birnbaum,  1962,  pages  223- 
242].   It  would  appear  beneficial  to  include  at  least  these 
last  tests  as  a  normal  part  of  the  analysis. 

D.   PARAMETRIC  TIME  SERIES  ANALYSIS 

The  final  method  is  that  advanced  by  Box  and  Jenkins 
[1970],  called  by  some,  parametric  time  series  analysis. 
This  method  involves  fitting  a  stochastic  model  to  the  time 
series,  while  allowing,  in  keeping  with  modern  statistical 
thought,  the  data  themselves  to  shape  the  model  and  the  analy- 
sis itself  to  a  relatively  great  extent.   This  method  is 
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relatively  new;  its  capabilities  and  limitations  are  not  com- 
mon knowledge  in  the  manner  that  those  of  spectral  analysis 
are.   In  order  to  better  understand  these  capabilities  and 
limitations,  particularly  as  they  apply  to  the  boundary  layer 
data  in  this  study,  it  is  necessary  to  discuss  the  farm  of 
the  models  to  be  fitted.   This  is  done  in  Sections  1  through 
3,  which  describe,  though  with  considerably  less  detail,,  the 
ideas  expressed  by  the  primary  source,  Box  and  Jenkins r  text 
[1970].   With  this  background  already  given,  the  particular 
applications  to  wind-wave  coupling  investigations  are  dis- 
cussed in  Section  E. 

1 .   Yule's  Proposition  x 

Probably  no  process  generating  experimental  observa- 
tions is  completely  deterministic.   Thus,  deterministic 
models  are  not  usually  adequate  in  analyses,  although  physi- 
cists and  others  have  successfully  used  them  in  analyses  of 
data  generated  by  an  apparatus  designed  to  be  so  accurate 
that  noise  contributions  to  the  observations  are  negligible 
in  relation  to  results  of  interest.   But  if  the  noise  is  of 
a  significant  magnitude,  as  with  turbulence  data,  models  in- 
cluding noise  must  be  used.   If  the  probability  structure  of 
the  process  is  of  interest,  as  in  the  case  of  atmospheric 
turbulence  studies,  the  normal  course  is  to  use  a  stochastic 
model.   The  model  fitted  by  parametric  time  series  analysis 
is  s  tochas tic . 

Central  to  parametric  time  series  analysis  is  Yule's 
proposition  [Yule,  1927]  that  a  time  series  whose  elements 

18 


are  highly  dependent,  as  is  the  case  with  wind-wave  records, 
may  usefully  be  viewed  as  having  arisen  from  the  application 
of  a  linear  filter  to  a  parallel  series  of  independent  random 
shocks  drawn  from  a  fixed  distribution.   This  viewpoint  is 
purely  heuristic,  and  need  not  correspond  to  the  actual  pro- 
cess at  all  to  be  useful.   The  proposition  is  symbolized  in 
Figure  la. 

In  applying  this  idea,  the  fixed  distribution  of  the 

series  of  random  shocks  will  be  taken  to  be  normal  with  mean 

2 
zero  and  variance  a   .   With  the  random  shock  series  repre- 

a 

sented  by  a  's,  the  observed  time  series  by  z  's,  and  the 

J       t  '   t 

linear  filter  elements  by  ip.   's,  the  relationship  can  be 


symbolized  as 


zfc  =  y  +  afc  +  lP1at_1    +    ^2at-2  +  ••• 
=  y  +  (1  +  lp  B  +  \p2B2    +    ...)  a 


=  y  +  41(B)  at 
where  B  is  the  standard  backward  shift  operator,  and 


(2.1) 


t     t-s 


(2.2) 

If  the  sequence  ip(B)  is  finite,  or  infinite  and  convergent 
for  B  given  a  value  of  unity,  the  filter  is  said  to  be  con- 
vergent.  If  the  filter  is  convergent,  then  z(t)  is  a  station- 
ary time  series,  and  y  is  the  mean  of  the  series.   If  the 
filter  is  unstable,  y  is  an  arbitrary  reference  point  for  the 
process  level,  and  z(t)  is  non-stationary. 
2 .   The  ARIMA  Model 

Parametric  time  series  analysis  is  based  on  fitting 
a  particularly  useful  subset  of  linear  filters  of  the  kind 
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a.  Yule's  Model 
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Figure  1.   Filter  Models 


<-  after  Box  &  Jenkins  [1970] 
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just  described.   This  class  of  models  is  known  as  the  autore- 
gressive  integrated  moving  average  (ARIMA)  class.   As  the  name 
suggests,  this  model  incorporates  three  separate  features, 
the  autoregressive  terms,  moving  average  terms,  and  integra- 
tion operator.   The  model  is  easier  to  understand  if  one  dis- 
cusses each  part  separately,  building  the  total  structure  as 
new  features  are  discussed. 

The  autoregressive  model  associates  the  current  value 
of  the  process  with  a  linear  combination  of  past  values  of 
the  process  and  a  current  random  shock.   An  autoregressive 
process  of  order  p,  AR(p),  would  be: 

2t  =  *1  St-1  +  *2  5t-2+-"+  *iZt-V+    at      (2'3) 
where  z   =  z   -  u.   If  one  considers  the  past  values  of  z   as 

the  "independent"  variables,  then  one  has  the  standard  linear 


regression  relation,  hence  the  name  autoregressive.   The  p+2 

i 
P 


2 
parameters:  (J>1  ,$„,...,  <J>  ,  U ,  O  ,  must  generally  be  estimated 


directly  from  the  data. 

It  is  clear  that  by  substituting  back  within  the 
model  (Equation  2.3)  for  the  past  values  used  by  the  model, 
e.g. 

5t-l  =  *l£t-2+---+  Vt-p-1  +  at-l  (2'4) 

one  can  eventually  produce  a  representation  of  the  AR(p)  pro- 
cess as  an  infinite  series  in  the  a  's,  and  hence,  as  a 
linear  filter  of  the  type  described  previously.   Writing  the 
Ar(p)  process  as 

$  CB)2   =  a„  C2.5) 

p     t     t 
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and 


then  one  obtains 


KB)at  =  zt  (2.6) 


6  "1(B)  =  ipCB)  (2.7) 

with  the  same  constraints  for  stationarity  of  tJj(B)  as  noted 
previously . 

The  second  important  process  is  the  moving  average 
process  of  order  q,  MA(q)  which  relates  the  current  value  of 
z   to  past  values  of  the  a  's.   The  MA(q)  process  may  be  re- 
presented as 

5t  =  at  "  61  at-l  "  92  at-2-----°qat-c,  (2-8) 
Writing  the  MA(q)  process  as: 

z.t  =  9(B)  afc  ,  (2.9) 

it  is  immediately  seen  that  this  process  already  has  the 

form  of  a  linear  filter  where 

ij;(B)  =  9  (B)  .  (2.10) 

In  a  formal  statistical  sense,  the  form  described  above  is 

not  confined  to  moving  averages  of  the  a  's,  since  the  9.'s 

need  not  be  positive,  nor  sum  to  one,  but  this  is  standard 

nomenclature  nonetheless.   The  q+2  parameters  of  the  model: 

2 
0_  ,  9?,...,  9  ,  y,  a   ,  again  generally  depend  directly  on 

the  data  for  their  estimation. 

These  two  processes  may  be  mixed  to  achieve  parsimony 

of  the  model  Cdiscussed  in  a  later  section).   The  result  is 

an  autoregressive  moving  average  process  of  order  (p,q)  that 

is,  ARMA  (p,q): 
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z   =  <J>.  z^  .,  +  ...+  i  z    f  a  -  0..  a    -...-0  a      ,_    . 
t    Tl   t-1       Tp  t-p    t    1   t-1       q  t-q   (2.11) 

or 

<j>  (B)  zt  =  9q  afc.  (2.12) 

Here  a  total  of  p  +  q+2  parameters:  <}),...,  <f>  ,  0  ,...,  0  ,  u, 

2 

0   ,  are  required,  and  in  general  must  all  be  estimated 

3 

directly  from  the  data.   If  q  is  finite,  0  (B)  is  a  stable 
filter.   For  the  equivalent  filter  of  the  ARMA  (p,q)  process 
to  be  stable,  then,  it  is  sufficient  to  require  (J)    (B)  to 
be  stable,  as  the  ARMA  (p,q)  equivalent  filter  is  a  simple 
infinite  series: 

iJi(B)  =  *p"1(B)  9  (B)  (2.13) 

Further,  it  is  clear  that  the  ARMA  (o,q)  model  is  equivalent 

to  the  MA(q),  and  the  ARMA  (p,o)  to  the  AR(p)  when  all  are 

applied  to  the  same  z  's  and  a  's. 

t  t 

For  reasons  of  simplicity  and  stability  of  calcula- 
tion, it  is  useful  to  require  that  the  only  ARMA(p,q)  proces- 
ses considered  be  those  which  are  stable,  and  hence  would 
result  in  a  stationary  z(t)  if  applied  to  a  sequence  of  ran- 
dom shocks,  a(t).   But  many  observed  time  series  are  inade- 
quately fitted  by  such  an  ARMA  (p,q)  model  because  they  are 
non-stationary  in  nature.   Many  of  these  may  be  adequately 
fitted  by  using  a  generalized  autoregressive  operator  (p(B) 
in  place  of  (J)  (.B)  .   The  result  will  be  an  autoregressive  in- 
tegrated moving  average  model.   This  is  shown  schematically 
in  Figure  lb. 
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It  can  be  shown  that  the  roots  of  the  equation 

Q    «    <f>pCB) 

m    I    +    <J>       B    +    <J>2    B    +  .  .  .+    <j>      BP  (2.14) 

where  B  is  now  considered  in  the  same  fashion  as  a  variable 

in  a  polynomial,  must  lie  outside  the  unit  circle  for  <}>    (B) 

to  be  stable.   It  can  further  be  shown  that  if  any  of  the 

roots  lies  inside  the  unit  circle  for  either 

0  =  <J>  (B)  (2.15) 

or 

0  =  0  (B)  (2,16) 

q 

then  the  ARMA(p,q)  model  exhibits  explosive  growth,  thus  vio- 
lent non-s ta tionar i ty .   However,  it  turns  out  to  be  very  use- 
ful in  treating  many  types  of  non-s tati onarity  to  allow  one 
or  more  of  the  roots  of 

Cp(B)  =  0  (2.17) 

to  be  equal  to  unity.   Letting  the  V  operator  symbolize  this: 

V  =  (1  -  B)  (2.18) 

one  can  then  write  ^(B)  as: 

<p  (B)  =  $  (B)  Vd 

=  <j>  (B)(l-B)d  (2.19) 

where  d  is  the  number  of  roots  of  Q'(B)  set  equal  to  unity. 

Thus,  the  autoregressiye  integrated  moving  average  model  may 

be  written : 

<P(B)z.  =  B  CB)  a.  (2.20) 

t     q      .t 


or 


<j>   (B)  Vd  2,  =  6  (B)  av  (2.21) 

p  t      q       t  . 
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This  model  can  be  symbolized  by  ARIMA  (p,d,q).   Substituting 

Wt  i  Vd  zt  (2.22) 

produces  a  process  in  the  w  's  which  can  be  represented  by  a 
stable  equivalent  filter,  since  the  process  may  be  written 

<J>  (B)wt  =  9qCB)at  (2.23) 

which  is  a  simple  ARMA  (p,q)  process  in  the  w  fs.   This  re- 
veals the  types  of  non-s tationari ty  which  can  be  handled.   In 
order  for  Equation  (2.23)  to  be  true,  w(t)  must  be  a  station- 
ary series.   Hence,  any  series  z(t)  which  can  be  reduced  to 
stationarity  by  forward  differencing  by  the  V   operator  can 
be  adequately  fit  by  such  a  model.   This  includes  any  series 
whose  "trend"  can  be  adequately  approximated  by  a  polynomial 
of  degree  d  or  less. 

We  note  that  the  term  integrated  arises  because 

-1  t 

Vz^  =  S'z^  =   Z   zu  (2.24) 

t      t    ,      n 

and,  for  example 

~       t     t     t 

S  z  =       I  I  I       zu  (2.25) 

t  ,      n  . 

i  =  _O0   -|=r_00  Yl=—0° 

Thus,  in  forecasting  a  future  value  of  z  ,  say  z    ,  one  uses 

the  summation  operator  to  go  from  the  predicted  and  derived 

w  's  to  the  predicted  z    's. 
t  t"rl 

There  is  still  one  important  characteristic  that  a 
time  series  may  have  that  is  useful  to  be  able  to  handle; 
seasonality.   Many  treatments  of  seasonality  are  possible 
within  the  framework  exhibited  so  far.   This  discussion  will 
be  limited  to  the  multiplicative  technique  described  below. 
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Box  and  Jenkins  [197QJ  should  be  referenced  for  more  detailed 
information  on  both  this  and  other  techniques. 

The  multiplicative  technique  is  a  simple  extension  of 
the  ARIMA  (p,d,q)  model.   The  basic  assumption  is  that  the 
variation  between  observations  separated  by  an  increment  of 
time  equal  to  the  period  of  the  seasonal  component  can  be 
adequately  fitted  by  the  same  sort  of  models  as  used  on  adja- 

g 

cent  observations.   Hence,  replacing  B  by  B   in  the  ARIMA 
(p»d,q)  model,  we  get  a  seasonal  representation: 

$p(BS)  V°  zt    =  GQ(BS)  at  (2.26) 

or 

wt  =  <^ibS'1+- • •+$pbS'P)  wt 

-  (®1BS,1+...+  0Q  BS'Q)  at+  at  (2.27) 

t     1  t-s         P  t-P  •  s      t 

-  @nW    -.  .  .-  6>  W  n  (2.28) 

1  t-s        Q  t-Q-s  v 


where 

Wt  -  V°  zt  -  (1  -  BS)D  2t 

But  it  is  also  clear  that  the  process  will  also  de- 
pend on  adjacent  values,  so  the  complete  multiplicative  model, 
the  ARIMA  (p,d,q)  x  (P,D,Q)  is: 

cf>p(B)$pCBS)Vg    it    =    6q(B)     ©q(BS)at  (2.29) 

This  result  is  a  very  powerful  class  of  models  for  fitting 
naturally  occurring  time  series.   This  is  the  class  upon 
which  parametric  time  series  analysis  is  based. 
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3 .   Transfer  Function  Models 

Box  and  Jenkins  [1970J  further  discuss  transfer  func- 
tion models  designed  to  fit  cases  in  which  two  processes  are 
related  by 
(1+E-l  D+...+HRDR)  Y(t)  =  (H0+H1  D+...+  HsDS)X(t-l)      (2.30) 

where  D  E  d/dt,  the  H's  and  H's  are  unknown  parameters,  and 
T  is  the  dead-time  or  delay  in  the  system.   Where  Y   and  X 
are  discrete  time  series  measured  at  equispaced  times,  (2.30) 
becomes 

(2.31) 


(1+S-V+...+E  Vr)Y^  =  (nn+n1v+. .  ,+n  vs)x. 

1        r     t      u   1        s     t-b 


where  D  is  replaced  by  V  =  1-B  as  before.   Substituting 

B  =  1-V,  we  obtain 

(1-61B-.  .  .-6rBr)Yfc  -  (u0-  cu1B-.  .  .-a)sBS)Xt_b 


or 


S(B)Y   =  w(B)B  X 


=  fi(B)X, 


(2.32) 


As  in  the  ARIMA  (p,d,q)  case,  this  relation  may  be  written  in 

the  form  of  a  linear  filter: 

-1 


Y   =  6   (B)fl(B)X   =  v(B)X 


(2 .33) 

which  is  stable  if  the  v(B)  series  converge  for  B  given  a 
numerical  value  such  that  0  <,     [b|  ^.  1.   It  is  to  be  noted  that 
the  transfer  function  model  and  the  stochastic  ARIMA  models 
are  somewhat  parallel. 

Now,  remembering  the  discussion  on  the  lack  of  deter- 
ministic time  series,  it  is  necessary  to  consider  a  stochastic 
transfer  function  model  by  adding  superimposed  noise,  N  , 
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to  the  process,  where  N   and  X   are  independent.   Hence,  we 
obtain 

Yt  -  vCB)Xt  +  Nt  (2.34) 

for  the  stochastic  model.   But  there  is  no  reason  to  suppose 
that  the  process  does  not  effect  the  noise  in  some  way,  or 
that  the  noise  contribution  is  unstructured.   Therefore,  the 
idea  is  advanced  that  the  noise,  N  ,  will  be  represented  as 
having  the  form 

Nt=  *(B)at-  9_1(B)e(B)at  (2.35) 

where  the  \p '  s ,  Q  '  s  ,  and  9's  are  as  before.   This  form  includes 
unstructured  noise,  the  raw  random  shocks,  a  's,  as  a  single 
subset  of  the  possible  structures.   Hence,  the  model  one  would 
be  led  to  fit  is 

Yt=  v(B)Xt+iKB)at 

=  6_1(B)fi(B)Xt+  9?~1(B)9(B)a      (2.36) 
This  is  shown  schematically  in  Figure  lc. 

E.   APPLICATIONS  TO  WIND-WAVE  COUPLING 

For  an  application  of  these  techniques  to  the  problem  of 
detection  of  hypothesized  organized  motion  in  air  adjacent  to 
natural  waves,  it  seems  clear  that  a  transfer  function  stoch- 
astic model  is  likely  to  give  the  most  direct  information  as 
to  the  structure  of  the  relationship  between  the  waves  and 
the  adjacent  airflow.   One  would  fit  the  transfer  function 
model  by  regarding  the  wave  heights  as  inputs,  X  ,  and  a  wind 
velocity  component  as  the  output,  Y  .   The  existence  of  signi- 
ficant values  for  the  v's  would  immediately  suggest  that  there 
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are  fluctuations  in  the  oyerlying  airflow  which  are  not  random 
in  relation  to  the  waves,  particularly  if  such  values  appear 
repeatedly  for  data  from  different  observational  periods.   It- 
is  a  fortunate  circumstance  that  statistical  tests  of  signifi- 
cance for  these  parameters  exist.   However,  the  accuracy  of 
these  tests  depends  on  how  closely  the  parent  distribution  of 
the  actually  calculated  random  shocks  corresponds  to  the  dis- 
tribution picked  to  be  the  theoretical  parent  of  the  random 
shocks  in  the  model.   The  structure  of  the  fitted  model  may 
give  clues  as  to  the  nature  of  the  actual  process.   It  is  an 
unfortunate  circumstance,  however,  that  the  transfer  function 
models  are  beyond  the  scope  of  this  paper. 

The  analyses  performed  were  involved  fitting  ARIMA  (p,d,q) 
x  (P,D,Q)'s  models  to  individual  time  series.   Three  benefits 
may  be  possible  from  this  procedure.   First,  models  fitted 
individually  to  wind  components  and  wave  heights  from  the  same 
observations  may  be  compared  for  suggestions  of  interaction. 
This  type  comparison  is  difficult,  however,  because  many  indi- 
vidual models  with  differing  orders  in  the  various  types  of 
processes  may  be  quite  similar  in  result  and  fit,  depending 
on  the  data.   Hence,  for  example,  one  may  not  be  able  to  say 
that  an  ARIMA  (l,d,0)  x  (0,0,0)  represents  data  with  a  differ- 
ent structure  than  that  of  a  series  best  fit  by  an  ARIMA 
(0,d,2)  x  CO, 0,0).   The  latter  arises  because  these  two  models 
closely  correspond  for  certain  sets  of  values  of  the  para- 
meters involved. 
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The  next  benefit  is  that  the  fitted  model  may  suggest  in 
itself  a  possible  structure  for  the  process.   Also  possible 
and  more  likely  is  the  fact  that  repeated  analyses  of  this 
kind  may  help  a  researcher  following  an  adaptive  model  build- 
ing strategy  to  infer  a  general  model  for  given  conditions. 
From  this,  he  may  infer  some  of  the  structure  of  the  parent 
process.   The  researcher  is  helped  in  this  quest  by  the  sta- 
tistical confidence  intervals  it  is  possible  to  obtain  for 
parameter  estimates  and  joint  confidence  intervals  for  groups 
of  parameters  (with  the  same  caution  here  as  presented  in  the 
transfer  function  model  discussion  directly  above). 

Third,  and  seemingly  most  important,  this  study  may  give 
an  idea  of  the  usefulness  of  these  techniques  in  fitting  to 
boundary  layer  data.   The  study  intended  to  give  an  indication 
whether  such  data  is  of  the  types  that  this  sort  of  analysis 
can  handle,  for  pre-evaluat ion  of  the  usefulness  of  further 
studies  along  these  lines. 

A  final  note  of  interest  is  that  Yule's  proposition  [Yule, 
1927],  discussed  in  I-D,  may  have  a  physical  significance  for 
wind-wave  data.   It  may  be  possible  to  view  the  generalized 
turbulence  of  the  airflow  adjacent  to  natural  waves  as  a 
parent  random  process  producing  a  series  of  a  's,  and  the 
waves  as  acting  like  a  linear  filter  on  these  a  's  to  produce 
organized  motion  and  a  resultant  structuring  of  the  noise 
elements,  that  is,  the  observed  turbulence.   Recent  papers  by 
Reynolds  {1968]  and  Davidson  and  Frank  [1972J  suggest  that 
the  structure  of  the  turbulent  field  is  important  in  many 


30 


facets  of  air-sea  interaction.   Hence,  parametric  time  series 
analysis  may  hold  great  potential  in  boundary  layer  studies. 
However,  to  reiterate  a  previous  statement,  it  is  not  neces- 
sary that  there  exist  any  correspondence  at  all  between  the 
actual  generating  process  and  the  Yule  viewpoint  for  tbis  type 
of  analysis  to  be  beneficial. 
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III.   ANALYSIS  CONSIDERATIONS 

The  data  considered  were  from  an  August  19  observation 
period,  described  in  Davidson  [1970]  .   From  these  data  three 
time  series  were  chosen:   wave  height,  u-component  of  the 
wind  1.5  meters  above  surface,  and  w-component  of  the  wind  at 
1.5  meters.   ARIMA  models  were  fitted  to  each  of  these  series, 
and  examined  for  goodness  of  fit.   The  models  were  examined 
for  information  contained  in  them  about  the  structure  of  the 
time  series  to  which  they  were  fit. 

The  primary  feature  of  parametric  time  series  analysis 
is  model  fitting.   An  iterative  approach  to  model  building 
was  used  which  allows  the  data  to  influence  the  model  struc- 
ture.  The  steps  in  this  procedure  are  discussed  in  Section  A. 
Again,  a  more  complete  discussion  of  this  procedure  is  found 
in  Box  and  Jenkins'  text  [Box  and  Jenkins,  1970,  p.  18]. 

A.   MODEL  BUILDING 

In  fitting  ARIMA  models,  an  iterative  model  building  pro- 
cedure is  most  fruitful.   The  steps  of  such  a  procedure  are 
described  below  and  the  process  is  summarized  in  Figure  2. 

1 .   Selection  of  General  Class  of  Model 

The  first  step  in  model  building  is  to  postulate  a 
general  class  of  models  to  be  fitted.   By  choosing  to  perform 
parametric  time  series  analysis,  the  researcher  has  implicit- 
ly chosen  to  use  the  ARIMA  class  of  models.   However,  further 
choices  are  possible  at  this  stage.   Due  to  the  very  nature 
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of  the  problem,  one  is  virtually  forced  to  consider  the 
seasonal  extension  to  the  ARIMA  Cp,d,q)  model. 
2 .   Identification  of  Model 

The  second  step  in  the  procedure  is  the  identification 
of  a  particular  model  or  subset  of  models  from  the  general 
class  chosen.   This  process  can  be  influenced  both  by  outside 
knowledge  and  information  contained  in  the  data. 

Having  chosen  to  consider  an  ARIMA  (p,d,q)  x  (P,D,Q) 
model,  theory  suggests  that  the  order  parameter  d  should  be 
zero,  and  D  should  not.   The  need  for  d^O  would  imply  a  notice- 
able change  in  the  mean  sea  level  was  taking  place  during  the 
period  of  observation.   The  need  for  D=0  would  imply  that  no 
seasonal  change  in  the  time  series  at  all  was  taking  place 
during  the  period  of  observation.   Seasonal  variation  certain- 
ly does  occur  in  the  waves,  and  we  expect  seasonal  variation 
in  the  air. 

It  can  be  shown  that  the  stable  non-stationary  opera- 
tor V  =  (1-BS)  has  zeroes  at  e-lk27T/  S  (k  =  0  , 1 ,  .  .  .  ,  s-1)  .   The 
application  of  this  operator  to  a  time  series  will  remove  a 
seasonal  component  of  the  form 

Is/2] 


seasonal  component  =  b  +  T,  {b,.cos( J-)+b„.sin( =*-)  } 

o  .    ,   lj      s     2j      s 
J  =  1 


3 

where  the  b's  are  adaptive  coefficients  implicit  in  the  data, 
and  {S/2J  =  S/2  if  S  is  even,  or  JS/2J  =  1/2  (S-1)  if  S  is 
odd.   This  type  of  operator  is  a  parsimonious  representation 
of  seasonal  components  of  a  type  which  require  many  sine  and 
cosine  waves  to  represent  them  adequately.   An  example  of 
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such  a  seasonal  component  would  be  a  spike  at  periodic  in- 
tervals . 

Theory  predicts  that  the  main  seasonal  effect  will 
have  the  form  of  a  single  sine  wave  at  the  frequency  of  the 
water  waves.   The  V    operator  is  not  a  parsimonious  repre- 
sentation of  a  single  sine  wave.   Furthermore,  the  adaptive 
coefficients  of  at  least  some  of  the  additional  sine  and 

cosine  waves  contained  in  the  V   operator  are  likely  to  be 

s   r 

inflated  by  intermittent  and  statistically  non-significant 
periodicities  in  the  random  turbulent  component  of  the  motion. 
Therefore,  an  operator  representing  a  single  sine  wave  at  the 
desired  frequency,  adaptive  in  phase  and  amplitude  to  the  data, 

was  applied  in  place  of  the  V   operator  in  some  models.   This 

2 
operator  is  [1-2  cos  (2tt/p)B  +  B  ],  where  p  is  the  desired 

period.   This  stable  non-stationary  operator,  which  has  zeroes 

+  i27i/p    ,  -  .,  ,  ,_ 

at  e—      ,  will  remove  a  seasonal  component  of  the  type: 

i  i  ./2ttXi1  /2ttn 

seasonal  component  =  bn  sm( )+  b_  cos  ( ) 

1      p      I  p 

=  b »  sin( +  5 ) 

3      P 

where  the  b's  are  adaptive  coefficients,  6  is  an  adaptive 
phase  angle,  and  p  is  the  desired  period. 

Parsimony  is  a  primary  consideration  at  this  stage. 
Parsimony  involves  making  use  of  the  most  efficient  type  of 
model  possible.  For  example,  while  an  AR(1)  process  can  be 
theoretically  represented  by  an  infinite  MA  process,  and  by  a 
finite  MA(q),  q  >  1,  process  in  practice,  it  is  most  parsi- 
moniously represented  by  the  AR(1)  model  involving  only  one 
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parameter  as  opposed  to  an  MA(q)  process  involving  q  >  1  para- 
meters.  Similarly,  an  AR(p)  model  is  not  a  parsimonious  re- 
presentation of  an  MA(1)  process,  nor  is  either  an  AK,  or  MA 
model  a  parsimonious  representation  of  a  mixed  ARMA  process. 

Hence,  it  is  necessary  to  go  to  the  data  for  an  indi- 
cation of  the  values  of  p,q,P,  and  Q  which  are  best  fit, 
remembering  that  in  practice  a  value  of  two  or  less  for  each 
is  normally  sufficient.   To  get  this  indication,  one  examines 
the  autocorrelation  function  and  partial  autocorrelation 
function.   A  brief  discussion  of  what  one  looks  for  is  to  be 
found  in  Appendix  A.   Estimates  of  p,q,P  and  Q  are  made,  and 
rough  estimates  of  the  fitted  parameters,  the  <j> '  s  ,  $'s,  6's, 
and  ©'s,  are  made. 

3 .  Est  imat ion 

The  third-  stage  of  the  procedure  is  to  fit  the  model 
to  the  data.   The  rough  estimates  of  the  fitted  parameters  are 
now  used  as  initial  guesses  for  iterative  estimation  routines. 
At  this  stage,  the  best  fit  for  the  model  chosen  is  obtained. 

4 .  Diagnostic  Checks 

The  final  stage  is  diagnostic  checking  to  determine  if 
the  fitted  model  is  adequate.   If  it  is,  the  procedure  ends, 
and  the  model  is  chosen.   If  the  model  is  determined  to  be  in- 
adequate or  unsatisfactory,  the  process  is  iterated  from  the 
second  step  until  a  suitable  model  is  either  found  or  deter- 
mined to  be  non-existent. 

A  more  complete  discussion  of  the  diagnostic  tools 
used  is  to  be  found  in  Appendix  B. 
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B.  PRELIMINARY  ANALYSES 

The  preliminary  analysis  produced  estimated  autocorrela- 
tion function  values,  estimated  autocovariance  function  values, 
and  estimated  partial  autocorrelation  function  values  of  the 
time  series  which  resulted  from  the  application  of  the  V  V 
operator,  with  various  values  of  d  and  D,  and  appropriate 
values  of  s,  to  the  initial  data.   These  estimated  function 

values  were  also  calculated  for  the  time  series  resulting  when 

D  2 

V   was  replaced  by  [1-2  cos  (2tt/p)B  +  B  ],  the  less  general 

operator  discussed  in  A-2.   From  these  estimated  function 

values,  the  proper  types  and  degrees  of  differencing  needed 

to  produce  a  stationary  time  series  were  estimated.   Further 

analysis  of  the  estimated  function  values  for  the  properly 

differenced  series  resulted  in  estimates  for  the  proper  values 

of  the  parameters  p,q,P,  and  Q,  for  the  multiplicative  ARIMA 

(p,d,q)  x  (P,D,Q)  model  to  be  fit.   The  set  of  the  most  likely 

model  candidates  was  the  result. 

C.  ANALYSES 

The  preliminary  analysis  yielded  the  set  of  the  most  like- 
ly model  candidates;  the  parameters  required  by  these  models, 
the  (J> '  s ,  $'s,  etc.,  were  estimated  by  mapping  the  maximum 
likelihood  surface  of  the  parameters,  as  reflected  by  the  sum 
of  the  squares  of  the  estimated  a  's  (that  is,  the  sum  of  the 
squares  of  the  residuals).   Thus,  .these  estimates  were  "least 
squares"  estimates,  and  this  method  is  an  accepted  routine 
for  non-linear  least  squares  estimation.   Diagnostic  checks 


37 


were  then  applied  to  the  a  's  resulting  from  the  use  of  the 
best  estimates  for  the  parameters  in  each  model.   These  diag- 
nostic checks  included  comparing  the  magnitudes  of  the  sums 
of  the  squares  of  the  residuals  of  different  models,  calculat- 
ing the  autocorrelations  of  the  residual  series,  and  calculat- 
ing a  chi-square  test  for  adequacy  of  fit  [Box  and  Jenkins, 
1970,  p.  291].   These  diagnostic  checks  were  then  compared  in 
an  effort  to  determine  the  best  model  of  those  examined. 

Unfortunately,  the  author  was  unable  to  get  a  Marquardt 
maximum  descent  nonlinear  least  squares  algorithm  to  work. 
Had  this  been  done,  reasonably  accurate  confidence  intervals 
and  joint  confidence  intervals  for  parameter  estimates  would 
have  been  easily  available  in  addition  to  the  speedier  arri- 
val at  the  least  squares  estimates.   As  this  was  not  done, 
alternate  methods  for  obtaining  these  estimates,  set  forth 
in  Box  and  Jenkins  [1970,  p.  228],  were  explored.   Consider- 
ing the  lack  of  real  need  for  such  interval  estimates  in  this 
study,  due  to  the  small  number  of  time  series  analyzed  and 
subsequent  preclusion  of  a  meaningful  attempt  at  an  adaptive 
model  building  strategy,  these  methods  were  found  to  be  too 
inefficient  and  time  consuming  to  be  of  sufficient  relative 
value  to  attempt.   Hence,  although  no  confidence  intervals 
were  estimated,  their  exclusion  was  not  significant  in  this 
study . 

In  summary,  the  estimated  autocorrelation  functions, 
estimated  autocovariance  functions,  and  estimated  partial 
autocorrelation  functions  of  the  time  series  resulting  from 
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application  of  the  difference  .operators  to  the  original  time 
series  were  used  to  identify  the  proper  degree  of  differenc- 
ing and  the  most  likely  types  of  models  to  be  fit  to  the  pro- 
perly differenced  series.   The  parameters  of  these  models  were 
estimated  by  a  nonlinear  least  squares  mapping  technique.   The 
residuals,  the  a  's  resulting  from  each  type  of  model  when  the 
least  squares  estimates  for  the  model's  parameters  were  used, 
were  tested  for  indications  of  the  model's  goodness  of  fit. 
The  results  of  these  tests  were  compared,  and  the  best  model 
was  chosen . 
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IV.  '  RESULTS 

The  results  of  the  parametric  time  series  analyses  were 
in  good  agreement  with  the  spectral  analyses  done  with  the 
same  data  by  Davidson  [1970]  .   The  results  clearly  indicate 
the  tractability  of  this  type  of  data  to  this  type  of  analy- 
sis . 

A.   PERIODIC  EFFECTS 

The  V  V   difference  operator  was  applied  to  the  data  for 
several  values  of  d  and  D.   Table  I  contains  the  values  ap- 
plied.  From  both  spectral  analysis  and  autocorrelation  func- 
tion estimates,  it  was  determined  that  the  dominant  period  of 
the  waves  was  best  estimated  as  six  seconds,  corresponding  to 
a  period  of  30  lags  in  the  time  series,  while  the  air  was 
observed  to  have  a  dominant  period  of  only  4.7  seconds,  cor- 
responding to  24  lags  in  the  time  series,  in  both  the  hori- 
zontal and  vertical  components.   The  estimated  autocorrelation 
functions  of  all  three  time  series  before  differencing,  Figure 
3,  shows  the  form  of  a  damped  sine  wave,  but  the  damping  is 
not  great.   The  autocorrelations  at  high  lag  numbers  do  not 
fall  off  sufficiently  fast  to  support  an  assumption  that 
these  autocorrelation  functions  have  arisen  from  stationary 
time  series.   It  can  be  seen  in  Figure  4  that  application  of 
the  V   operator  did  not  result  in  a  picture  more  compatible 
with  the  assumption  that  stationarity  resulted.   None  of  the 
dif f erencings  involving  the  V  V   operator  were  satisfactory 
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TABLE  X 
DEGREES  OF  DIFFERENCING  EXAMINED 


VdVDz     or   I.1-2cos(2tt/p)B+  B2]z 

Si-  I- 

p=s=24  for  Airflow;  p=s=30  for  Wave  heights 


run  no. 

d 

D 

s 

[1-2cos(2tt/p)B+B  ] 

1 

0 

0 

No 

No 

2 

1 

0 

No 

No 

3 

2 

0 

No 

No 

4 

0 

1 

24/30 

No 

5 

0   • 

2 

24/30 

No 

6 

1 

1 

24/30 

No 

7 

1 

2 

24/30 

No 

8 

2 

1 

24/30 

No 

9 

2 

2 

24/30 

No 

10 

0 

0 

No 

Yes 

11 

1 

0 

No 

Yes 

12 

2 

0 

No 

Yes 

13 

0 

1 

4 

Yes 
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when  D  f    0.   Applying  the  V   alone,  with  d  =  2,  resulted  in  a 

satisfactory  pattern  for  this  data  (.Figure  5).   However,  the 

2 
difference  between  the  V   operator  and  the  non-general  opera- 

2 
tor  11-2  cos  (2tt/p)B  +  B  J,  discussed  in  III-A-2,  is  small 

for  p  =  24  for  the  wind,  and  p=30  for  the  waves.   The  [1-2  cos 

2 
(2iT/p)+  B  ]  operator  gives  a  slightly  better  result  (Figure  6) 

and  with  the  support  from  theory,  the  assumption  that  this 

operator  is  the  proper  one  seems  justified.   No  value  d  f    0 

gave  a  better  result  than  d=0  when  the  full  V  [  1-2  cos  (2tt/p  )  B 

2 
+  B  ]  operator  was  applied,  so  the  final  series  considered  was 

w   =  [1-2  cos  (2tt/p)  B  +  B2]  z 
t  t 

The  result  that  no  local  differencing  was  necessary  (that 
is,  for  V  ,  d=0),  and  that  the  best  results  arose  from  the 
application  of  the  non-general  operator  which  took  out  a 
single  sine  wave,  is  precisely  in  agreement  with  the  hypo- 
thesis concerning  the  organized  motion  considered  in  this 
paper.   The  fact  that  the  frequency  of  the  organized  motion 
in  the  air  is  not  the  same  as  the  dominant  frequency  of  the 
waves  is  not  in  good  agreement  with  the  hypothesis  considered, 
though  it  is  in  agreement  with  Davidson's  spectral  results 
[Davidson,  1970J.   The  lack  of  efficiency  in  the  V   operator 
for  production  of  stationarity  was  judged  to  be  due  to 
spurious  periodicities  in  the  random  turbulent  components  of 
the  series . 
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B.   ARMA  MODEL  FITTING 

Once  a  satisfactorily  stationary  time  series  yas  produced, 
a  closer  examination  of  the  estimated  autocorrelation  and  es- 
timated partial  autocorrelation  functions  revealed  that  these 
functions  showed  unmistakable  evidence  that  a  mixed  model, 
that  is,  one  requiring  both  autoregressive  and  moving  average 
terms,  was  required  for  each  series.   However,  the  three  ser- 
ies each  showed  different  structure,  and  were  considered 
separately . 

1 .  Wave  Model 

The  estimated  autocorrelation  function  of  the  waves 
was  most  easily  diagnosed  for  indications  of  the  proper  model. 
The  pattern  exhibited  is  that  of  an  ARMA  (p,q)  x  (P,Q)  where 
p=0,  q=l,  P=l,  Q=l.   The  pattern  was  quite  distinct,  and  over- 
fitting,  by  -letting  p=l,  q  =  2,  P=2,  q  =  2,  resulted  in  estimates 
for  the  <f>  ,  $_ ,  Q     ,     and  ©„  parameters  which  were  close  to 
zero,  while  no  significant  decrease  in  the  sums  of  the  squares 
of  the  residuals  was  achieved.   Fitting  a  smaller  model,  one 
with  fewer  parameters,  resulted  in  a  significant  increase  in 
the  sum  of  the  squares  of  the  residuals.   Hence,  an  ARMA  (0,1) 
x  (1,1)  was  considered  best,  and  the  final  fitted  model  was 

(l-$1B3°)wt  =   (i-e1B)a-<3)1B30)at 

where  $1  =  f  0.11,      0   =-0.53,  and  ©   =  -0.33. 

2 .  U-Component  Model 

The  estimated  autocorrelation  function  of  the  u-com- 
ponent  series  did  not  immediately  correspond  to  a  particular 
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model  in  the  way  that  the  waves'  function  did.   No  particular 
need  for  a  seasonal  component  to  the  model,  other  than  differ- 
encing, was  suggested.   Hence,  P  and  Q  were  initially  taken  as 
zero,  and  subsequent  fitting  of  ARMA  (p,q)  models  showed  that 
the  best  model  of  this  type  was  the  ARMA  (1,1).   The  final 
fitted  model  of  this  type  was,  after  overfitting  in  both  local 
and  seasonal  parameters, 

(l-<J)1B)wt  -  (l-61B)at 

where  <J>   =  -0.31,  and  8.  =  -0.74. 

It  was  interesting  to  note  that  the  estimated  partial 
autocorrelation  showed  large  values  at  periodic  intervals. 
There  seemed  to  be  two  periods  involved,  one  of  24  and  the 
other  of  31.   This  is  a  possible  indication  that  a  seasonal 
au toregress ive  operator  might  be  useful.   Two  such  models 
were  tried,  one  with  s=24  and  one  with  8=31: 

(l-C{)1B)(l-$1B31)wt  -  (l-eiB)at 
and 

(l-({)1B)(l-$1B24)wt  =  (l-91B)at 

Each  of  these  models  produced  improvements  in  the  diagnostic 
checks  of  the  a  's,  but  the  improvement  was  slight  enough  to 
make  it  questionable  if  these  are  parsimonious  operators. 
3 .   W-Component  Model 

The  estimated  autocorrelation  function  and  the  esti- 
mated partial  autocorrelation  function  of  the  w-component 
series  showed  no  indication  that  seasonal  parameters  other 
than  differencing  were  required.   After  overfitting  in  both 
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local  and  seasonal  parameters,  the  ARMA  (1,1)  x  (0,0)  was 
judged  best.   The  final  model  was 

Cl-*1B)wt  =  (1-^B)  at 

where  <j>  *    -0.09  and  6   «  -0.87. 

C.   SUMMARY 

All  three  time  series  were  most  adequately  reduced  to 
stationarity  by  application  of  a  difference  operator  which 
removed  a  single  sine  wave,  with  adaptive  amplitude  and  phase 
angle,  from  the  data.   The  period  of  the  motion  in  the  air 
thus  removed  differed  from  that  of  the  waves,  the  period  in 
the  air  being  4.7  seconds,  as  opposed  to  the  6.0  second 
dominant  period  in  the  waves. 

Once  reduced  to  stationarity,  the  wave  series  was  best 
represented  by  an  ARMA  (0,1)  x  (1,1)  model   The  stationary 
series  from  the  u-component  was  adequately  represented  by 
either  an  ARMA  (1,1)  x  (1,0)  or  an  ARMA  (1,1)  model.   The  w- 
component  was  best  represented  by  an  ARMA  (1,1)  model. 

The  data  proved  to  be  of  the  type  for  which  parametric 
time  series  analysis  is  useful.   The  results  show  good  agree- 
ment with  previous  results  obtained  by  spectral  analysis  of 
the  same  data  by  Davidson  {197QJ. 
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V.   CONCLUSIONS 

The  necessity  for  removing  a  single  sine  wave  from  the 
data  to  achieve  stability  strongly  suggests  that  there  was  a 
significant  sinusoidal  component  of  organized  motion  in  the 
air  flow  immediately  over  the  waves  during  the  period  in  which 
the  data  were  gathered.   That  this  sinusoidal  motion  has  a 
dominant  period  which  is  different  than  that  of  the  waves  is 
not  supportive  of  the  theories  which  predict  that  the  air 
will  have  organized  motion  with  the  same  period  as  the  waves. 
However,  it  is  noted  that  the  periods  in  question  differ  only 
by  1.3  seconds,  although  this  appears  significant  in  this 
densely  sampled  time  series,  which  had  about  25  observations 
per  period  (observations  every  .2  seconds). 

The  fact  that  none  of  the  series  reduced  to  a  random  walk 
model ,  that  is , 

zt  "  V 

after  being  differenced  to  s ta tionar ity ,  suggests  that  either 
further  organized  motion  exists,  or  that  some  sort  of  struc- 
ture is  imposed  upon  the  turbulence.   Due  to  the  fact  that 
only  one  series  of  each  type  was  considered,  the  significance 
of  the  ARMA  models  which  were  fitted  to  the  properly  differ- 
enced series  is  unknown.   Too  few  series  were  considered  to 
be  able  to  generalize  about  the  structure  of  the  turbulent 
flow  from  the  ARMA  models  derived  here.   The  possibility 
clearly  exists  that  the  repeated  fitting  of  such  models  may 
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result  in  a  formulation  of  a  general  model  by  an  adaptive 
strategy . 

The  most  important  conclusion  of  this  exploratory  ¥ork 
is  that  parametric  time  series  analysis  may  be  applied  to 
wind-wave  data  with  a  reasonable  expectation  that  results  of 
value  can  be  obtained.   It  is  noted  in  this  respect  that 
this  analysis  closely  agreed  with  the  traditional  spectral 
approach  in  their  region  of  overlap.   The  results  in  this 
study  show  that  the  examination  of  the  estimated  autocorrela- 
tion and  partial  autocorrelation  functions  of  various  degrees 
of  differencing  of  the  original  series  may  be  of  value  in 
understanding  the  structure  of  such  series,  even  if  the  actual 
fitting  of  the  ARMA  model  to  the  resulting  series  is  not 
attempted.   In  short,  the  parametric  approach  appears  to  hold 
considerable  potential  in  the  analysis  of  organized  motion 
within  turbulent  boundary  layer  data. 
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APPENDIX  A 
Model  Identification  Considerations 

The  initial  identification  of  the  proper  form  of  the  ARIMA 
to  best  fit  the  data  is  based  on  examination  of  the  estimated 
autocorrelation  function  and  the  estimated  partial  autocorre- 
lation function  of  the  series.   There  are  two  distinct  steps 
involved:  identifying  the  degree  of  differencing  necessary  and 
identifying  the  resultant  ARM A  model.   These  steps  are  des- 
cribed briefly  below;  they  are  more  fully  developed  in  Box 
and  Jenkins  [1970,  pp.  174-207;  300-334]. 

The  first  step  in  initial  model  identification  is  deter- 
mining the  degree  of  differencing  necessary  to  produce  a 

stationary  time  series  from  the  raw  data.   Thus  one  is  attempt- 

d  D 
ing  to  estimate  the  proper  values  of  D  and  d  in  the  V  V 

operator.   The  distinctive  characteristic  of  a  non-stationary 
time  series  for  identification  purposes  is  the  fact  that  the 
autocorrelation  function  does  not  go  quickly  to  zero  with 
increasing  lag  number.   The  autocorrelations  may  decrease 
with  lag  number,  but  the  decrease  will  be  closer  to  linear 
than  the  exponential  damping  expected  with  a  stationary  time 
series.   Further,  recurrent  patterns  spaced  a  fixed  distance 
in  time  (at  a  fixed  number  of  lags  apart)  may  indicate  the 
need  for  seasonal  differencing,  if  the  autocorrelation  co- 
efficients do  not  fall  off  sufficiently  fast.   Hence,  one 
examines  first  the  estimated  autocorrelations  of  the  original 
data,  and  then  successive  degrees  of  differencing,  both 
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seasonal  and  local,  until  a  pattern  consistent  with  a  sta- 
tionary series  is  found. 

The  second  step  is  to  examine  the  autocorrelations  aris- 
ing from  the  properly  differenced  series  for  indications  of 
the  resulting  ARMA  model  to  fit  to  the  differenced  series. 
One  is  aided  in  this  by  the  knowledge  of  the  behavior  of  the 
autocorrelation  and  partial  autocorrelation  functions  for 
various  models.   The  theoretical  autocorrelation  function  of 
an  AR(p)  model  tails  off  and  the  partial  autocorrelation 
function  abruptly  goes  to  zero  after  p  lags.   The  theoretical 
autocorrelation  function  of  a  MA(q)  process  goes  abruptly  to 
zero  after  q  lags,  while  the  partial  autocorrelation  function 
tails  off.   Both  functions  tail  off  in  a  mixed  model.   The 
estimated  functions  will  generally  follow  the  theoretical 
functions.   However,  the  estimates  have  large  variances,  and 
the  match  need  not  be  close.   In  general,  it  is  possible  to 
produce  an  autocovar iance  generating  function  for  any  given 
model.   This  function  can  be  used  to  show  the  form  of  the 
theoretical  autocorrelation  for  any  model.   One  can  then  com- 
pare the  estimated  autocovar iance  (or  autocorrelation)  func- 
tion to  the  theoretical  possibilities  to  find  a  reasonable 
match . 

In  summary,  the  estimated  autocorrelation  and  partial 
autocorrelation  functions  are  examined  for  characteristic 
patterns  which  indicate  the  proper  degree  of  differencing 
and  the  proper  ARMA  model  (to  be  fit  to  the  properly  differ- 
enced series).   Due  to  the  variability  of  such  estimated 
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functions,  precise  definition  of  model  is  often  impossible, 
and  the  skill  and  experience  of  the  analyst  may  be  signifi- 
cant in  obtaining  the  best  results. 
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APPENDIX  B 
Diagnostic  Checks  of  Model  Adequacy 

In  order  to  insure  that  the  model  which  has  been  fit  is 
adequate,  it  is  useful  to  have  available  diagnostic  procedures 
for  testing  the  adequacy  of  a  given  model.   Such  tests  may  be 
based  on  both  evaluations  at  the  total  model  level  and  at  the 
level  of  the  residuals  alone.   Several  diagnostic  tests  which 
were  used  are  discussed  below;  these  tests,  and  others,  are 
more  fully  described  in  Box  and  Jenkins  [1970,  pp.  287  ff]. 

A  primary  diagnostic  check  on  the  total  model  level  is 
the  comparison  of  the  magnitudes  of  the  sums  of  the  squares 
of  the  residuals  between  models.   A  significant  decrease  in 
this  value  with  a  change  in  model  indicates  that  the  original 
model  was  not  adequate.   A  second  check  on  the  whole  model 
level  is  overfitting  of  parameters.   If  one  judges,  for  exam- 
ple, that  an  ARIMA  (p,d,q)  is  adequate  while  noting  that  if 
it  were  inadequate,  it  would  likely  be  inadequate  in  the  AR 
parameter,  then  one  might  fit  an  ARIMA  (p+l,d,q)  model.   One 
could  then  compare  results  of  original  model  and  the  over- 
fitted  model  for  indications  of  the  original  model's  adequacy. 
That  is,  if  no  significant  change  in  fit  occurred,  as  would 
be  the  case  if  the  best  estimates  of  the  extra  parameters  in 
the  overfit  model  were  all  close  to  zero,  then  one  is  re- 
assured as  to  the  adequacy  of  the  model. 

To  overcome  the  need  for  the  prior  knowledge  of  the  sus- 
pected deficiencies  in  the  model,  such  prior  knowledge  being 
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implied  in  overfitting,  it  is  possible  to  use  diagnostic 
checks  on  the  residuals  generated  by  the  model.   The  auto- 
correlation function,  or  equivalently  the  spectrum,  of  the 
residuals  can  be  examined  for  indications  that  the  residuals 
generated  do  not  approximate  white  noise.   Individual  auto- 
correlations may  be  tested  for  significant  deviation  from  a 
value  of  zero  by  Student-t  tests,  using  the  appropriate  limits 
for  estimated  residuals  rather  than  those  for  white  noise. 
Finally,  the  first  K  autocorrelations  of  the  residuals  may 
be  looked  at  as  a  whole.   The  value  Q  such  that 

K    2 
Q  =  N   Z   r   (a), 


m=l 


in 


where  r   (a)  is  the  residual  autocorrelation  coefficient  at 
m 

lag  m,  and  N  =  number  of  observations  less  d,  the  differenc- 

2 
ing  parameter,  is  distributed  as  X  (K-p-q-P-Q-M) .   Here,  the 

p,q,P,  and  Q  are  as  before,  and  M=l  if  the  mean  were  removed 

from  the  initial  series  before  the  analysis,  and  M=0  otherwise 

2 
Hence,  the  final  test  discussed  here  is  an  X  -goodness  of  fit 

test . 

In  summary,  tests  may  be  made  either  between  models  or 

within  models  to  determine  the  adequacy  of  models.   Both  types 

of  tests  were  used  in  the  analysis  presented  in  this  paper. 
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